Triphone State-Tying via Deep Canonical Correlation Analysis

Authors

  • Weiran Wang
  • Hao Tang
  • Karen Livescu
Abstract

Context-dependent phone models are used in modern speech recognition systems to account for co-articulation effects. Due to the vast number of possible context-dependent phones, state tying is typically used to reduce the number of target classes for acoustic modeling. We propose a novel approach to state tying which is completely data-dependent and requires no domain knowledge. Our method first learns low-dimensional embeddings of context-dependent phones using deep canonical correlation analysis. The learned embeddings capture similarity between triphones and are highly predictable from the acoustics. We then cluster the embeddings and use cluster IDs as tied states. The bottleneck features of a DNN predicting the tied states achieve competitive recognition accuracy on TIMIT.
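
The clustering step described in the abstract (cluster the triphone embeddings, then use cluster IDs as tied states) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not say which clustering algorithm is used, so plain k-means with a deterministic farthest-point initialisation stands in for it; the two-dimensional embeddings are hand-made toy values rather than output of deep canonical correlation analysis; and the triphone labels and the helper names `farthest_point_init`, `kmeans`, and `tie_states` are hypothetical.

```python
import numpy as np


def farthest_point_init(points, k):
    """Pick k initial centroids: the first point, then repeatedly the point
    farthest from all centroids chosen so far (deterministic)."""
    centroids = [points[0]]
    for _ in range(1, k):
        # Squared distance from each point to its nearest chosen centroid.
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(points[d.argmax()])
    return np.array(centroids)


def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    centroids = farthest_point_init(points, k)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


def tie_states(triphones, embeddings, n_tied):
    """Map each triphone to a tied-state ID, i.e. the cluster of its embedding."""
    labels, _ = kmeans(np.asarray(embeddings, dtype=float), n_tied)
    return {tri: int(lab) for tri, lab in zip(triphones, labels)}


# Toy data (assumed, not from the paper): two well-separated groups.
triphones = ["k-ae+t", "g-ae+t", "k-ae+d", "s-iy+n", "z-iy+n", "s-iy+m"]
embeddings = np.array([
    [0.00, 0.10], [0.10, 0.00], [0.05, 0.05],
    [5.00, 5.10], [5.10, 5.00], [5.05, 5.05],
])
tied = tie_states(triphones, embeddings, n_tied=2)
```

In this toy setup, triphones whose embeddings lie close together share a tied-state ID, which is the property the tied states need in order to serve as a reduced set of DNN targets.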

Similar resources

Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...

State tying for context dependent phoneme models

In this paper several modifications of two methods for parameter reduction of Hidden Markov Models by state tying are described. The two methods are a data-driven clustering of triphone states with a bottom-up algorithm [3, 9], and a top-down method growing decision trees for triphone states [2, 10]. We investigate several aspects of state tying as the possible reduction of the word error rat...

Rule-Based Triphone Mapping for Acoustic Modeling in Automatic Speech Recognition

This paper presents rule-based triphone mapping for acoustic model training in automatic speech recognition. We test whether incorporating expanded knowledge at the level of parameter tying in acoustic modeling improves the performance of automatic speech recognition in Slovak. We propose a novel technique of knowledge-based triphone tying, which allows the synthesis of unseen triphones. The...

Smoothing and tying for Korean flexible vocabulary isolated word recognition

For large-vocabulary recognition systems, as well as for flexible-vocabulary applications using hidden Markov models (HMMs), parameter smoothing and tying have been used to increase the reliability of models. This paper describes bottom-up and top-down clustering techniques for state-level tying. This paper also describes a method of applying parameter smoothing to the clustered states and covarianc...

Effective Triphone Mapping for Acoustic Modeling in Speech Recognition

This paper presents effective triphone mapping for acoustic model training in automatic speech recognition, which allows the synthesis of unseen triphones. A description of this data-driven model clustering, including experiments performed using 350 hours of a Slovak audio database of mixed read and spontaneous speech, is presented. The proposed technique is compared with tree-based state ty...

Publication date: 2016